N=1:

rank | frequency | n-gram |
---|---|---|
1 | 41697 | -и |
2 | 24339 | -о |
3 | 13936 | -а |
4 | 12342 | -н |
5 | 11419 | -ӣ |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 10968 | -ро |
2 | 6779 | -ои |
3 | 6140 | -ии |
4 | 5352 | -он |
5 | 4945 | -ҳо |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 5486 | -ҳои |
2 | 2343 | -они |
3 | 2232 | -ово |
4 | 2186 | -анд |
5 | 1677 | -иро |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1416 | -аҳои |
2 | 1089 | -ская |
3 | 1071 | -ский |
4 | 813 | -ҳоро |
5 | 761 | -онро |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 616 | -вская |
2 | 485 | -вский |
3 | 379 | -андаи |
4 | 332 | -тарин |
5 | 324 | -нская |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := @pos + 1 AS pos, xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
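For readers without database access, the same count can be sketched in Python. This is a minimal illustration, not the query used to produce the tables: the function name `top_endings` and the toy word list `sample` are hypothetical, and the `w_id>100` filter is left out because it depends on the database layout. Like `right(word,3)`, the slice `w[-n:]` returns the whole word when the word is shorter than N.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each row is (rank, frequency, "-" + n-gram), mirroring the table columns.
    Words shorter than n contribute the whole word, as RIGHT(word, n) does.
    """
    counts = Counter(w[-n:] for w in words)
    return [
        (rank, freq, "-" + gram)
        for rank, (gram, freq) in enumerate(counts.most_common(k), start=1)
    ]


# Hypothetical toy sample; the tables above come from the full word list.
sample = ["китобҳои", "шаҳрҳои", "мардони", "занони", "хонаҳои"]

for n in range(1, 6):
    print("N =", n, top_endings(sample, n))
```

Ties are returned in the order the n-grams were first seen, so the ranking of equally frequent endings may differ from the SQL result.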